Modifying web page code to include code to protect output

ABSTRACT

Examples disclosed herein relate to modifying a web page. In one example, in response to beginning execution of a process initiating generation of a web page of a web application at a server, a runtime agent is executed. In this example, the runtime agent modifies code of the web page to inject code to protect output of the web page. In the example, the process can be executed using the modified code to generate a modified web page.

BACKGROUND

Web applications are implemented by a combination of hardware and software and run in a web browser. Applications are a popular target for attackers. Applications can have vulnerabilities susceptible to attacks. Examples of attacks on applications include cross-site scripting. Cross-site scripting enables attackers to inject a client-side script into web pages viewed by others.

BRIEF DESCRIPTION OF THE DRAWINGS

The following detailed description references the drawings, wherein:

FIG. 1 is a block diagram of a computing system capable of modifying code of a web application to generate a modified web page, according to an example;

FIG. 2 is a block diagram of a system including a computing system capable of modifying code of a web application to generate a modified web page, according to an example;

FIG. 3 is a flowchart of a method for modifying code of a web page to include code to protect output, according to an example;

FIG. 4 is a block diagram of a computing device capable of modifying code of a web application to generate a modified web page with injected code to protect output of the modified web page, according to an example;

FIG. 5 is a flowchart of a method to provide a modified web page capable of being used by a browser capable of implementing a security policy, according to an example;

FIG. 6 is a flowchart of a method for modifying code of a web page to include context sensitive validation, according to an example;

FIG. 7 is a diagram of an unmodified section of code that includes output at a web page, according to an example; and

FIG. 8 is a diagram of a modified section of code that includes output at a web page, according to an example.

DETAILED DESCRIPTION

Web applications are implemented by a combination of hardware and software. Web applications may be run in web browsers. Access is provided by web application servers. Web applications have become popular due to the popularity of web browsers, which can be used as clients for the applications. However, web applications are the target of various attacks.

As the number and complexity of enterprise web applications grow, the attack surface for exploits increases, leaving enterprises exposed. Traditional methods of protecting web applications can take significant time to implement and may be focused on the development stage of software. These methods often do not protect web applications running in production. Additionally, with a majority of successful breaches occurring at the application layer, installing a simple perimeter defense to protect production software may lack effectiveness.

Attacks that allow a client to modify a web page accessed by another client, such as cross-site scripting (XSS), are common web application vulnerabilities. XSS allows an attacker to inject HyperText Markup Language (HTML) or client-side scripts such as JavaScript and VBScript into web pages viewed by other users. In some examples, the injected script can bypass access control, steal user credentials, execute arbitrary harmful scripts under the user's security context, and the like.

One approach to protect against XSS is input validation. However, it is challenging to maintain accurate whitelists for all parameters. Further, basing input validation on blacklisting is susceptible to false negatives.

Accordingly, approaches disclosed herein help protect against web page output attacks such as XSS by adding code to protect a web application. A runtime agent acts upon the web application by including code at certain points of the web application to call the runtime agent to execute. Processes to generate web pages are intercepted when a web page to be served is generated. The runtime agent determines a context associated with output of the web pages and determines code to inject to protect the output, where the source code is associated with the context. As used herein, context is the setting in which the output is used. The context can be determined by an identifier as noted below. The code to inject to protect the output is added to the source code before generating the web page. According to examples described herein, the source code includes information used to generate a web page. The source code can include content and/or references to content.

As described herein, a runtime agent is a combination of programming and hardware to watch the internal operations performed by the web application during execution of the web application. The runtime agent can be called when particular Application Programming Interfaces (APIs) are called (e.g., where the particular APIs include the code to call the runtime agent). As used herein, an API specifies a set of functions or routines that accomplish a specific task. The runtime agent can be called when a process to generate the web page is initiated and can modify the web page to include additional code injected to protect output of the web page before the web page is generated.

In some examples, the code injected to protect the output relates to context sensitive validation, such as Context Sensitive Encoding (CSE). The context sensitive validation can be based on a data structure, such as a table indicating a context sensitive validation to implement based on an identifier or multiple identifiers in the source code. A set of possible identifiers for identifying context in a web page can be stored in a data structure associated with the runtime agent. The identifiers can include identifiers that open a context (e.g., <script>, <form>, etc.) and identifiers that close a context (e.g., </script>, </form>, etc.). The identifiers in the data structure can be compared to the source code to identify the associated context.

A context is the setting that output of a web page is in. Context can be considered as the information about any particular location inside the HTML document. For example, if the location is an HTML node element, then the relevant information can be all parent/child/sibling nodes; if the location is inside an HTML attribute, then the relevant information can be the node name, other node attributes, and child nodes; and if the location is inside a script tag, then the relevant information can be whether it is quoted text or not. For example, if a first identifier <html> followed by a second identifier <body> precedes the output and the output is followed by </body> and </html>, the context of the output is HTML text. In this case, the content between <body> and </body> is the output. In one example, an identifier, when processed or parsed, can identify the beginning of a context, and another identifier can identify the end of the context. In some examples, an identifier may include one character or multiple characters. Examples of contexts include HTML Text, HTML Attribute, HTML Comment, HTML Attribute: URL, JavaScript Text, JavaScript String, etc. An HTML parser and/or regular expressions can be used to determine the context based on identifiers in the code. As noted, the setting is identified using the identifiers; for example, the context can be identified by usage of anchor tags or other syntax.
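As a non-limiting illustration, the comparison of stored identifiers against source code might be sketched as follows. The class name ContextDetector, the method contextAt, and the three-entry identifier table are hypothetical and are not taken from the examples herein; an actual implementation may use an HTML parser and would handle attributes, nested quoting, and tags carrying attributes.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: all names and the identifier table are illustrative assumptions.
final class ContextDetector {
    // Opening identifier -> { closing identifier, context name }.
    private static final Map<String, String[]> IDENTIFIERS = new LinkedHashMap<>();
    static {
        IDENTIFIERS.put("<!--", new String[] {"-->", "HTML Comment"});
        IDENTIFIERS.put("<script>", new String[] {"</script>", "JavaScript Text"});
        IDENTIFIERS.put("<body>", new String[] {"</body>", "HTML Text"});
    }

    /** Returns the innermost context that is open (not yet closed) at the given offset. */
    static String contextAt(String source, int offset) {
        String prefix = source.substring(0, offset);
        String best = "Unknown";
        int bestStart = -1;
        for (Map.Entry<String, String[]> entry : IDENTIFIERS.entrySet()) {
            String open = entry.getKey();
            String close = entry.getValue()[0];
            int openAt = prefix.lastIndexOf(open);
            if (openAt < 0) {
                continue; // this context was never opened before the output
            }
            boolean closed = prefix.indexOf(close, openAt + open.length()) >= 0;
            if (!closed && openAt > bestStart) {
                best = entry.getValue()[1]; // latest unclosed opener wins
                bestStart = openAt;
            }
        }
        return best;
    }
}
```

In this sketch, the offset passed to contextAt would be the position of an output expression (e.g., the start of a <%= ... %> expression), and the returned context name would then be used to look up the validation to inject.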

In other examples, the code injected to protect the output can help enforce a security policy implemented at an end user's web browser. For example, certain web browsers may implement a security policy, such as Content Security Policy (CSP). The security policy can be standardized, such as in CSP, or otherwise implemented. The security policy can be implemented in two parts: a web application that is compliant with the policy and a browser that implements the policy. A web application is compliant with the policy if it provides information that can be used by the browser to protect the output content of the web application according to the policy. The browser can implement the security policy based on the information that a compliant web application provides. In an example using CSP, the web application can provide a standard Hypertext Transfer Protocol (HTTP) header that declares approved sources of content that the browser should be allowed to load on a web page. The approved sources of content can be implemented, for example, using a whitelist of content that is from approved sources. The browser can implement the policy based on this information. Examples of types of content include JavaScript, Cascading Style Sheets (CSS), HTML, frames, fonts, images, embeddable objects such as applets, audio files, video files, etc. In one example, the runtime agent can add security policy headers to the web application source code when a page is being generated during execution of the web application. As such, the policy can be implemented without need for modifying and recompiling the whole web application.
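As a non-limiting illustration, a header declaring approved script sources might be assembled as in the following sketch. The header name Content-Security-Policy and the default-src and script-src directives are part of CSP; the class name CspHeader, the method value, and the particular policy chosen are assumptions made for illustration only.

```java
import java.util.List;

// Hypothetical sketch: builds a CSP header value from a whitelist of script sources.
final class CspHeader {
    static final String NAME = "Content-Security-Policy";

    /** Joins the approved script sources into a single policy string. */
    static String value(List<String> approvedScriptSources) {
        // Assumption: everything other than scripts is restricted to the page's own origin.
        return "default-src 'self'; script-src " + String.join(" ", approvedScriptSources);
    }
}
```

In a servlet-based web application, the resulting name and value could, for example, be set on the HTTP response before the page is served.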

FIG. 1 is a block diagram of a computing system capable of modifying code of a web application to generate a modified web page, according to an example. FIG. 2 is a block diagram of a system including a computing system capable of modifying source code of a web application to generate a modified web page, according to an example.

In one example, computing system 100 can include a web application 110, a runtime agent 112, and a web page 114. In another example, the computing system can be part of system 200 and further include an interface engine 216, a log 218, processor 230, memory 232, input/output interfaces 234, etc. In some examples, system 200 can further include a communication network 240 that allows the computing system 100 to serve the web application 110 to devices 244a-244n. The computing system 100 can include a physical processor(s) implementing machine readable instructions to cause the computing system 100 to perform certain functionality.

When serving the web application 110, the computing system 100 can provide an interface, via the interface engine 216 and input/output interfaces 234, to allow the devices 244 to access the web application 110. Input/output interfaces 234 can include network interface cards, a keyboard interface, a mouse interface, other components to interact with the computing system 100, etc. A browser 246 on the device 244 can be used to access web pages 114 of the web application 110. In certain examples, a web page 114 provides content of the web application 110 to the browser 246. The browser 246 can be used to cause display of the web page 114 including output 250.

As noted, the web application 110 can be executed by a device 244 in a browser 246. The web application 110 can be created in a browser-supported programming language such as JAVASCRIPT, HyperText Markup Language (HTML), Cascading Style Sheets (CSS), etc. Further, a web application framework (e.g., .NET, JavaEE, Apache Velocity, etc.) can be used to implement the web application 110. Examples of web applications 110 include email applications, maps and navigation applications, banking sites, trading sites, news sites, forums, etc. The web application 110 can have access to a database or multiple databases (not shown).

As noted, the web application 110 may be encoded in any suitable Web-based computer language, such as JAVA or .NET, among others. The web application 110 may operate within a suitable software framework, such as Struts, Struts 2, ASP.NET MVC, Oracle WebLogic, Spring MVC, or the like. The software framework includes a set of common code modules that provide generic functionality, which can be selectively overridden or specialized by user code to provide specific functionality. The web application 110 may be configured to execute one instance or multiple instances of a Java Virtual Machine (JVM), Common Language Runtime (CLR), or other runtime environment.

The runtime agent 112 can operate within the execution environment of the web application 110 and has access to the internal operations performed by the web application 110. For example, the runtime agent 112, in certain examples, may modify the bytecode of the web application 110 by injecting additional code, such as a JAVA class, at various program points. The modified bytecode can be used to call the runtime agent to perform a method. One method that can be called is a method to record and manage security events, such as a failed context sensitive validation or a CSP violation. A second method that can be called is a method to modify web page code. The methods that can be called are not limited to the examples described herein.

The injected code may be added at strategic program points in the web application 110, for example, at application programming interface (API) calls that perform specific operations, such as generating a web page 114. In one example, a strategic program point is at the start of a particular API that is part of the process in the application framework of the web application 110 responsible for generating a web page from source code. In another example, a strategic program point is at the end of a particular API, before calling of another API or function, etc. The APIs can be determined by a programmer and inserted into the runtime agent. With this approach, a programmer need not look at the code of the web application to determine the program point; instead, the program point is associated with the API. The runtime agent can insert the program points for APIs to call particular methods of the runtime agent.

A browser 246 can request a web page 114 from the web application 110. In response to the request, source code for the requested web page 114 can be gathered to provide to the browser 246. A process to generate the web page 114 can be called by the web application 110.

In various examples, the runtime agent 112 is initiated in response to a strategic program point of the web application 110 being called or executed. The process for generation of the web page 114 that includes the program point can be dependent on the type of programming language implemented.

In certain examples, the process for generation of the web page can be a compile process or part of the compile process to generate the web page 114 from source code. The compile process may include one API and/or multiple APIs. Tomcat Java Server Page (JSP) is a technology to dynamically generate web pages based on HTML, XML, or other document types. In the example of Tomcat JSP, the compile process may include the API org.apache.jasper.compiler.JSPUtil.getReader(), which may be called to read a JSP source file and then pass the file stream reader to the JSP compiler. In this example, the runtime agent 112 can be invoked in response to the web application 110 calling JSPUtil.getReader(), and the runtime agent 112 can modify the data stream received via JSPUtil.getReader() to include injected code (e.g., context sensitive validation, a security policy header to invoke a browser policy response, etc.) to protect output. As such, when the JSP compiler is working, the JSP compiler sees the data stream as modified by the runtime agent 112. In certain examples, the API that indicates that a web page 114 is going to be generated (in this example, compiled) includes the strategic program point to call the runtime agent 112 because the runtime agent 112 previously injected code into the API, and the data can be modified before the process that generates the web page 114 runs. In some examples, the process to generate the web page may include access to the source code for the web page 114. The API indicative that the web page 114 is going to be generated can be determined by a programmer based on the framework used to implement the web application 110. The example shown is for the Tomcat JSP compiler; however, other readers that have access to the web page code can be used.
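As a non-limiting illustration, modifying the data stream before it reaches the compiler might be sketched as follows. The class SourceRewriter and its methods are hypothetical, and the rewrite shown (wrapping every JSP expression in a single validation call) is a placeholder that ignores context; the sketch only shows how a reader returned by a source-reading API could be replaced with a reader over modified source.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical sketch: all names are illustrative assumptions.
final class SourceRewriter {
    /** Drains and closes the reader, returning its full contents. */
    static String readAll(Reader in) {
        StringBuilder out = new StringBuilder();
        char[] buffer = new char[4096];
        try (Reader reader = in) {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                out.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    /** Returns a reader over the modified source, to hand to the page compiler. */
    static Reader wrap(Reader original) {
        return new StringReader(protect(readAll(original)));
    }

    /** Placeholder rewrite: wraps each JSP expression in a validation call. */
    static String protect(String source) {
        return source.replaceAll("<%=\\s*(.+?)\\s*%>", "<%= Library.checkNoTag($1) %>");
    }
}
```

Buffering the whole source before rewriting keeps the sketch simple; it assumes page sources are small enough to hold in memory.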

The runtime agent 112 modifies the source code of the web page to inject code to protect output of the web page. As mentioned above, in certain examples, the injected code to protect output may comprise a context sensitive validation. In other examples, the injected code can include adding information to guide a browser 246 implementing a security policy. Further, in some examples, the injected code may include both context sensitive validation and security policy information as protections. Responsive to the web page being modified with the injected code and the runtime agent 112 completing execution of its method, execution is returned to the web application 110, and the web application 110 can then proceed to generate (e.g., compile) the web page to generate a modified web page.

As noted above, runtime agent 112 can be used to inject code for context sensitive validation, inject code for implementing a security policy, or a combination thereof for web pages. A service can be provided using the runtime agent 112 to protect the web application 110. The runtime agent 112 can be configured to protect the web application 110 based on context sensitive validation, the security policy, or a combination thereof based on the service to be provided.

In the example of context sensitive validation, the runtime agent 112 determines context sensitive validation for the source code based on an identifier in the source code related to the output. As noted above, a set of possible identifiers for identifying context in a web page 114 can be stored in a data structure associated with the runtime agent 112. The identifiers can include identifiers that open a context and identifiers that close a context. A context is the setting that output of a web page is in. For example, if a first identifier <html> followed by a second identifier <body> precedes the output and the output is followed by </body> and </html>, the context of the output is HTML text. The runtime agent 112 makes the determination based on a data structure, such as a table indicating a context sensitive validation to implement based on the identifier in the source code. In one example, an identifier, when processed or parsed by the runtime agent 112, can be used to identify the beginning of a context. In some examples, an identifier may include one character or multiple characters. A parser for the programming language used by the web application 110 can be used to determine the context using the identifiers. In this example, another identifier can identify the end of the context. Table 1 includes some examples of validation functions for context sensitive validation to include based on a context of the output. As noted above, the output can be considered content within the context.

TABLE 1

Context                Validation Function
HTML Text              Library.checkNoTag(data)
HTML Attribute         Library.checkNoQuote(data)
HTML Comment           Library.checkNoDashDash(data)
HTML Attribute: URL    Library.checkIsURL(data)
JavaScript Text        Library.checkAlphaNumeric(data)
JavaScript String      Library.checkNoQuote(data)

For each output located within the respective contexts found within the web page 114, the runtime agent 112 adds a context sensitive validation. In the examples above, the validation function can check to see if content included to provide as output includes a particular feature that is associated with a vulnerability.
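As a non-limiting illustration, the validation functions named in Table 1 might be sketched as follows. Only the function names come from Table 1; the function bodies, the return convention (returning the data unchanged on success and throwing on failure), and the URL rule (accepting only http and https schemes) are assumptions made for this sketch.

```java
import java.util.Locale;

// Hypothetical sketch of the Table 1 validation functions; bodies are assumptions.
final class Library {
    /** Returns data unchanged when valid; otherwise signals a validation failure. */
    private static String require(boolean valid, String data, String context) {
        if (!valid) {
            throw new IllegalArgumentException("validation failed in context: " + context);
        }
        return data;
    }

    // HTML Text: an angle bracket could open an injected tag.
    static String checkNoTag(String data) {
        return require(data.indexOf('<') < 0, data, "HTML Text");
    }

    // HTML Attribute and JavaScript String: a quote could close the surrounding quote.
    static String checkNoQuote(String data) {
        return require(data.indexOf('"') < 0 && data.indexOf('\'') < 0, data, "Quoted");
    }

    // HTML Comment: "--" is needed to close the comment.
    static String checkNoDashDash(String data) {
        return require(!data.contains("--"), data, "HTML Comment");
    }

    // HTML Attribute: URL. Assumption: only http(s) URLs pass, which rejects javascript: URLs.
    static String checkIsURL(String data) {
        String lower = data.trim().toLowerCase(Locale.ROOT);
        return require(lower.startsWith("http://") || lower.startsWith("https://"), data, "URL");
    }

    // JavaScript Text: letters and digits only.
    static String checkAlphaNumeric(String data) {
        return require(data.matches("[A-Za-z0-9]*"), data, "JavaScript Text");
    }
}
```

Returning the data on success lets a call be placed directly around an output expression, so that valid output is emitted as before.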

In one example, the context of the output is HTML text. In this example, a vulnerability can be that XSS may be injected if an angle bracket is present. Thus, the validation function can check for a “<”. If a “<” is present in the output of the web page 114 when executed by the browser, the validation would fail.

In another example, the context of the output is an HTML attribute. The example may include the following code: Name: <input name="name" value="<%=request.getParameter("name") %>">. The attribute specifies the value of text for the input element “name.” In this example, the output data is inside an HTML attribute (in this example, the value attribute). Here, an XSS attack would need a quote ["] to close a previous start quote. Without the end quote ["], an XSS attack would not be successful. Thus, the validation can check for the quote. As such, validation would not be successful if the output of web page 114 when executed by the browser included a quote.

In a further example, the context of the output is an HTML comment. This example may include the following code: <!-- debug: <%=request.getParameter("debug") %> -->. Here, the output data is inside the HTML comment and thus would need a “-->” to close the previous start comment. Without the “-->”, an XSS attack would not be successful. Thus, the validation can check for that string or a subset of the string (e.g., no dash dash). Thus, if “-->” is included in the value that is received at “debug”, the validation would fail. An example of a value of “debug” that could be an XSS attack is --><script>alert('hi')</script><!--. This would attempt to close the HTML comment context, which would allow injection of the script including the alert, and then restart another HTML comment that gets closed.

In another example, the context of the output is an HTML attribute with a Uniform Resource Locator (URL). The example may include the following code: <a href="<%=request.getParameter("ur") %>">Click here</a>. In this example, the output data is a special case for an HTML attribute. Here, a URL inside an <a> tag href attribute does not require a ["] to launch an attack; thus, the validation function can check for a JavaScript URL. If a JavaScript URL is present in the output when the web page 114 is executed, the validation would fail.

In a further example, the context of the output is JavaScript Text. The example may include the following code: var x=<%=request.getParameter("value") %>;. In one example, the output is inside of JavaScript text and would just need a “;” to add extra instructions via XSS. Thus, a context sensitive validation can be made to ensure that the value is alphanumeric and does not include other characters or symbols. The context sensitive validation can also protect against other similar attacks not using “;” in this example.

In yet another example, the context of the output is a JavaScript String. The example may include the following code: var x='<%=request.getParameter("value") %>';. The output is inside of a JavaScript string and thus would need a ['] to add instructions via XSS. Thus, a context sensitive validation can be included to check to make sure there is not a quote and/or single quote.

These are a sample of context examples. The runtime agent 112 may include other contexts as well. There are various output encoding contexts that can be implemented (e.g., eXtensible Markup Language (XML) context, HTML/XML Tag and Attribute Names, JavaScript inside HTML, Cascading Style Sheets (CSS) inside HTML, URL inside HTML, HTML inside HTML, URL parameter names and values, URL Path and File Name Parts, URL Host Name and Port, URL Protocol, CSS selectors and rules, CSS Identifiers, CSS Property Values, CSS String Literals, URL Inside CSS, etc.). In some examples, the runtime agent 112 can be updated with information about contexts and/or associated validation. With this approach, new contexts and validations can be added to protect the web application 110 without updating the whole web application 110. In some examples, the source code can be processed to determine, based on one identifier or multiple identifiers, the context. A validation function can be mapped to the context and injected as code to protect the output from potential XSS.

The validation function can call an API provided by the runtime agent 112 to send a security alert to the runtime agent 112 upon a validation failure. The API can include code to provide information to the runtime agent 112. Security alerts can be used to call a method performed by the runtime agent 112. The runtime agent can, in response to receiving the security alert, perform a security action associated with the alert, such as log the activity in a log 218, block the output associated with the failed validation function, etc. Further examples of context sensitive validation are provided in the description of FIGS. 7 and 8.
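As a non-limiting illustration, an API through which a failed validation reports a security alert might be sketched as follows. The class RuntimeAgentApi, the Action enumeration, and the in-memory list standing in for log 218 are hypothetical; the sketch shows only the two security actions named above, logging and blocking.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: all names are illustrative assumptions.
final class RuntimeAgentApi {
    enum Action { LOG_ONLY, BLOCK }

    // Stands in for log 218; a real agent would likely persist or forward events.
    private static final List<String> LOG = new ArrayList<>();
    static volatile Action policy = Action.BLOCK;

    /** Called by an injected validation on failure; returns the text to emit as output. */
    static synchronized String onValidationFailure(String context, String data) {
        LOG.add(context + ": " + data);            // record the security event
        return policy == Action.BLOCK ? "" : data; // block by emitting nothing
    }

    static synchronized List<String> log() {
        return Collections.unmodifiableList(new ArrayList<>(LOG));
    }
}
```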

In the example of a security policy, the runtime agent 112 can add a security policy header, as part of the injected code to protect the output, to the modified web page to invoke a browser policy response in a client-side browser 246. In one example, for a context of a static JavaScript block, the runtime agent 112 can calculate a hash, which will be included in the security policy header when the web page 114 is served. For a JavaScript block with dynamic content, the runtime agent 112 can add a nonce attribute in the “<script>” tag; the final nonce value can be a random value generated when the web page 114 is served. In certain examples, each HTML event can be rewritten to a “<script>” tag with a nonce value.

In the example of CSP, when the web page 114 is served, the runtime agent 112 can dynamically add the security policy HTTP header with all the calculated hashes for the web page 114. Further, a new random nonce can be generated and be inserted into a corresponding “<script nonce=...>” location. HTML events include two parts, an attribute and a script. The attribute describes an action. When the action of the event occurs, the script is run. To rewrite the event, the script is included with an interrupt reflecting the action. Thus, similarly, when the action of the interrupt occurs, the script is run.
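As a non-limiting illustration, calculating a hash for a static script block and generating a per-response nonce might be sketched as follows. The 'sha256-...' and 'nonce-...' source expressions are part of CSP; the class CspTokens, its method names, and the choices of SHA-256 and a 16-byte nonce are assumptions made for this sketch.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.security.SecureRandom;
import java.util.Base64;

// Hypothetical sketch: all names are illustrative assumptions.
final class CspTokens {
    private static final SecureRandom RANDOM = new SecureRandom();

    /** Hash source expression for a static script body, computed once per page. */
    static String scriptHash(String scriptBody) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256")
                    .digest(scriptBody.getBytes(StandardCharsets.UTF_8));
            return "'sha256-" + Base64.getEncoder().encodeToString(digest) + "'";
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is required on every Java platform
        }
    }

    /** Fresh random nonce, generated each time the page is served. */
    static String newNonce() {
        byte[] bytes = new byte[16];
        RANDOM.nextBytes(bytes);
        return Base64.getEncoder().encodeToString(bytes);
    }

    /** script-src directive listing one hash and one nonce. */
    static String scriptSrc(String hashSource, String nonce) {
        return "script-src " + hashSource + " 'nonce-" + nonce + "'";
    }
}
```

In this sketch, the value returned by newNonce would be written both into the header and into the nonce attribute of the corresponding script tag for the same response.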

Upon an XSS injection that violates the policy, the browser 246 will send a CSP violation report to the web application 110. When processing the web page 114, the browser 246 checks the content of the web page 114 against the header including the calculated hashes and/or nonce values. The browser 246 hashes the same static content (e.g., a static script), and if the value of the browser-calculated hash is not equal to the previously calculated hash in the header, a violation occurs. Similarly, when a script is to be executed by the browser, the script is checked to determine whether there is a script nonce and whether the script nonce equals one of the script nonce values in the header. The API that receives the violation report at the web application 110 can include a program point that calls the runtime agent to intercept the violation report, convert the violation report to a runtime event, and provide it as a runtime security event. The runtime security event can be converted to be displayed to a user of the device 244a, an administrator, etc. Further, the runtime security event can be sent to a security management platform, which may present the runtime security event in a graph in a dashboard.

In some cases, the runtime agent 112 cannot confirm in advance whether the injected code will accidentally block normal operations and therefore cause false positives of violations to be reported. A false positive occurs when there is no violation of the security policy, but a violation report is created. To guard against creating false positives due to inadvertent blocking of normal operations by injected code, the runtime agent 112 will first assume CSP is supported and can issue a “Content-Security-Policy-Report-Only” header to instruct the browser 246 to report, but not block, any CSP violation. If the runtime agent 112 does not receive any CSP violation report after a certain number of CSP compliant browsers have been serviced, then the runtime agent 112 can conclude CSP is actually supported on that page. Otherwise, responsive to receiving a violation report, the runtime agent 112 can cause the web application 110 to re-generate or re-compile the web page 114 with CSP disabled (e.g., without the injected code to protect the output) because the reception of the violation report likely indicates inadvertent blocking of normal operations. In some examples, the injected code to protect output for which the runtime agent 112 receives CSP violation reports can be removed, while other injected code to protect other output on the web page can still be implemented. The violation reports can indicate which injected code was triggered. The injected code reported as triggered can be removed, while other injected code is still included. This is because the reported injected code is likely to be inadvertently blocking normal operations, while there is no evidence that other injected code is blocking normal operations. In other examples, each of the segments of injected code to protect output can be removed to remove any possibility that the runtime agent 112 is inadvertently causing blocking of normal operations.
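As a non-limiting illustration, the report-only trial described above might be tracked per page as in the following sketch. The two header names are part of CSP; the class CspRollout, the Mode enumeration, and the rule that violation reports received after enforcement begins do not disable the policy are assumptions made for this sketch.

```java
import java.util.Optional;

// Hypothetical sketch: all names are illustrative assumptions.
final class CspRollout {
    enum Mode { REPORT_ONLY, ENFORCED, DISABLED }

    private final int requiredCleanResponses;
    private int cleanResponses;
    private Mode mode = Mode.REPORT_ONLY;

    CspRollout(int requiredCleanResponses) {
        this.requiredCleanResponses = requiredCleanResponses;
    }

    /** Called each time the page is served to a CSP compliant browser. */
    synchronized void onPageServed() {
        if (mode == Mode.REPORT_ONLY && ++cleanResponses >= requiredCleanResponses) {
            mode = Mode.ENFORCED; // no reports during the trial: enforce from now on
        }
    }

    /** Called when a violation report arrives for this page. */
    synchronized void onViolationReport() {
        if (mode == Mode.REPORT_ONLY) {
            mode = Mode.DISABLED; // likely inadvertent blocking: regenerate without CSP
        }
    }

    synchronized Mode mode() {
        return mode;
    }

    /** Header to send with the page, or empty when CSP is disabled for it. */
    synchronized Optional<String> headerName() {
        switch (mode) {
            case REPORT_ONLY: return Optional.of("Content-Security-Policy-Report-Only");
            case ENFORCED:    return Optional.of("Content-Security-Policy");
            default:          return Optional.empty();
        }
    }
}
```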

In some examples, responsive to an HTTP request being sent to the web application 110, the web application 110 will handle the request by processing the corresponding web page. For example, for JavaServer Pages (JSP) technology, the web application 110 would compile the JSP page into a Java source file on the first request, and then compile the Java source file into a Java class. Subsequent requests can then be processed by running the previously compiled Java class directly. This approach can be taken with other technologies being implemented as well (e.g., .NET, APACHE VELOCITY, etc.), where the web page 114 is generated at a first request for the web page and future requests can be serviced by the generated web page.

The communication network 240 can use wired communications, wireless communications, or combinations thereof. Further, the communication network 240 can include multiple sub communication networks such as data networks, wireless networks, telephony networks, etc. Such networks can include, for example, a public data network such as the Internet, local area networks (LANs), wide area networks (WANs), metropolitan area networks (MANs), cable networks, fiber optic networks, combinations thereof, or the like. In certain examples, wireless networks may include cellular networks, satellite communications, wireless LANs, etc. Further, the communication network 240 can be in the form of a direct network link between devices. Various communications structures and infrastructure can be utilized to implement the communication network(s).

By way of example, the devices 244a-244n, computing system 100, etc. communicate with each other and other components with access to the communication network 240 via a communication protocol or multiple protocols. A protocol can be a set of rules that defines how nodes of the communication network 240 interact with other nodes. Further, communications between network nodes can be implemented by exchanging discrete packets of data or sending messages. Packets can include header information associated with a protocol (e.g., information on the location of the network node(s) to contact) as well as payload information.

The engines, modules, and parts described herein can be distributed within an individual device or across more than one device. The engines (e.g., interface engine 216) include hardware and/or combinations of hardware and programming to perform functions provided herein. In some examples, the web application 110 and/or the runtime agent 112 can be implemented as an engine. Moreover, modules can include programming functions and/or combinations of programming functions to be executed by hardware as provided herein. When discussing the engines and modules, it is noted that functionality attributed to an engine can also be attributed to a corresponding module and vice versa. Moreover, functionality attributed to a particular module and/or engine may also be implemented using another module and/or engine. Examples of modules and engines include the runtime agent 112, the interface engine 216, and the web application 110.

A processor, such as a central processing unit (CPU) or a microprocessor suitable for retrieval and execution of instructions, and/or electronic circuits can be configured to perform the functionality of any of the engines and/or modules described herein. In certain scenarios, instructions and/or other information, such as rules, can be included in memory (e.g., a computer readable medium). In some examples, input/output interfaces may additionally be provided by the devices. For example, input devices, such as a keyboard, a sensor, a touch interface, a mouse, a microphone, etc. can be utilized to receive input from an environment surrounding the devices. Further, an output device, such as a display, can be utilized to present information to users. Examples of output devices include speakers, display devices, amplifiers, etc. Moreover, in certain examples, some components can be utilized to implement functionality of other components described herein. Input/output devices such as communication devices like network communication devices or wireless devices can also be considered devices capable of using the input/output interfaces.

FIG. 3 is a flowchart of a method for modifying source code of a web page to include injected code to protect output of the web page, according to an example. FIG. 4 is a block diagram of a computing device capable of modifying source code of a web application to generate a modified web page with injected code to protect output of the modified web page, according to an example. Although execution of method 300 is described below with reference to computing device 400, other suitable components for execution of method 300 can be utilized (e.g., computing system 100). Additionally, the components for executing the method 300 may be spread among multiple devices. Method 300 may be implemented in the form of executable instructions stored on a machine-readable storage medium, such as storage media 420, and/or in the form of electronic circuitry.

The computing device 400 includes, for example, a processor 410 and machine-readable storage media 420 including instructions 422, 424 for calling the runtime agent and protecting the application by modifying a web page before generation. Computing device 400 may be, for example, a desktop computer, a workstation, a server, or any other computing device capable of performing the functionality described herein.

Processor 410 may be at least one central processing unit (CPU), at least one semiconductor-based microprocessor, at least one graphics processing unit (GPU), other hardware devices suitable for retrieval and execution of instructions stored in machine-readable storage media 420, or combinations thereof. For example, the processor 410 may include multiple cores on a chip, include multiple cores across multiple chips, multiple cores across multiple devices (e.g., if the computing device 400 includes multiple node devices), or combinations thereof. Processor 410 may fetch, decode, and execute instructions 422, 424 to implement method 300. As an alternative or in addition to retrieving and executing instructions, processor 410 may include at least one integrated circuit (IC), other control logic, other electronic circuits, or combinations thereof that include a number of electronic components for performing the functionality of instructions 422, 424.

Machine-readable storage media 420 may be any electronic, magnetic, optical, or other physical storage device that contains or stores executable instructions. Thus, the machine-readable storage medium may be, for example, Random Access Memory (RAM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a storage drive, a Compact Disc Read Only Memory (CD-ROM), and the like. As such, the machine-readable storage medium can be non-transitory. As described in detail herein, machine-readable storage media 420 may be encoded with a series of executable instructions for protecting a web application. Further, in some examples, the various instructions 422, 424 can be stored on different media.

The computing device 400 can serve a web application to other devices such as clients. These clients can access the web application using a combination of hardware and software such as a web browser or local application.

Runtime agent call instructions 422 can be included in the web application to call the runtime agent to modify source code of a web page. In certain examples, the runtime agent call instructions 422 include code at strategically relevant program points of the web application, for example, at a point of a process initiating generation of a web page from web page code. The strategically relevant program point in this case can be at an API. At 302, responsive to beginning execution of a process initiating generation of the web page of the web application, the runtime agent is executed to modify the web page code. As noted above, the process initiating generation of the web page can include a process to read the web page before compiling, such as the JSPUtil.getReader( ) process in the example of Tomcat JSP noted above. The JSPUtil.getReader( ) process can be considered a part of the web page generation initiation process as being part of the process used to compile or generate the web page. Another part of the compile process can be the actual compiling and/or generation.
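The interception at a program point can be illustrated with a minimal sketch. It assumes a read step that returns page source as a string; the names `read_page`, `agent_modify`, `hook`, and `generate_page` are hypothetical stand-ins and are not part of Tomcat or of the described system.

```python
from typing import Callable, Dict

# Hypothetical in-memory page store standing in for page files on the server.
PAGES: Dict[str, str] = {"hello.page": "<html><body><%= name %></body></html>"}


def read_page(name: str) -> str:
    """Read step of page generation (the role the reader process plays in the text)."""
    return PAGES[name]


def agent_modify(source: str) -> str:
    """Placeholder for the agent's rewrite; here it only tags the source."""
    return "<!-- modified by runtime agent -->" + source


def hook(read: Callable[[str], str], modify: Callable[[str], str]) -> Callable[[str], str]:
    """Return a reader that passes the source through the agent before it is used."""
    def hooked_read(name: str) -> str:
        return modify(read(name))
    return hooked_read


def generate_page(name: str, read: Callable[[str], str]) -> str:
    """Stand-in for compiling/generation: it consumes whatever the reader returns."""
    return read(name)


hooked_reader = hook(read_page, agent_modify)
modified_page = generate_page("hello.page", hooked_reader)
```

Because the generation step only ever sees what the reader returns, the application code that requests the page does not need to change in this sketch.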

At 304, the runtime agent can then process the source code of the web page to modify the source code of the web page to inject code to protect output of the web page. The web page can be generated as a modified web page including the injected code. As noted above, the injected code to protect output can include context sensitive validation, a header to cause enforcement of a web page security policy such as CSP, combinations thereof, etc.

As noted above, in one example, while implementing a web page security policy, adding a header to indicate to a browser to enforce a web page security policy can be implemented by determining and adding a security policy hash to the modified web page before generation to invoke a particular browser policy response. Similarly, a nonce can be added to dynamic content as described above.

Further, an identifier associated with the output can be used to determine a context of the output. The context can be used to determine the injected code to be used and/or placement of the injected code. As noted above, a set of possible identifiers for identifying context in a web page can be stored in a data structure associated with the runtime agent. The identifiers can include identifiers that open a context and identifiers that close a context. A context is the setting that output of a webpage is in. For example, if a first identifier <html> followed by a second identifier <body> precedes the output and the output is followed by </body> and </html>, the context of the output is HTML text. A corresponding context sensitive validation function, in this example Library.checkNoTag(data), can be injected.
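One way to picture the identifier-based context lookup is the following sketch. It assumes a JSP-like `<%= name %>` output marker, a small table of opening and closing identifiers, and a mapping from context to validation function. Only Library.checkNoTag is named in this description; the identifier table, the `Library.checkNoQuote` name, and the helper functions are hypothetical.

```python
import re

# Hypothetical identifier table: opening identifier -> (closing identifier, context).
CONTEXT_IDENTIFIERS = {
    "<html>": ("</html>", "html_document"),
    "<body>": ("</body>", "html_text"),
    "<script>": ("</script>", "script"),
}

# Hypothetical mapping of context to the validation function to inject.
VALIDATORS = {
    "html_text": "Library.checkNoTag",
    "script": "Library.checkNoQuote",
}

TAG = re.compile(r"</?[a-z]+>")
OUTPUT = re.compile(r"<%=\s*(\w+)\s*%>")


def context_at(source: str, position: int) -> str:
    """Track opened and closed identifiers before `position`; the innermost open one wins."""
    stack = []
    for match in TAG.finditer(source, 0, position):
        tag = match.group()
        if tag in CONTEXT_IDENTIFIERS:
            stack.append(tag)
        elif stack and CONTEXT_IDENTIFIERS[stack[-1]][0] == tag:
            stack.pop()
    return CONTEXT_IDENTIFIERS[stack[-1]][1] if stack else "unknown"


def inject_validation(source: str) -> str:
    """Wrap each output expression in the validator mapped to its context, if any."""
    def wrap(match):
        validator = VALIDATORS.get(context_at(source, match.start()))
        if validator is None:
            return match.group()  # no mapping for this context: leave output untouched
        return f"<%= {validator}({match.group(1)}) %>"
    return OUTPUT.sub(wrap, source)
```

For instance, `inject_validation("<html><body>Hello <%= data %></body></html>")` wraps `data` in `Library.checkNoTag(...)`, because the innermost open identifier before the output is `<body>`.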

When the runtime agent has completed execution to modify the code of the web page, at 306, the process initiating generation of the web page can be executed using the modified code to generate a modified web page. The modified web page can be served to a web browser.

FIG. 5 is a flowchart of a method to provide a modified web page capable of being used by a browser capable of implementing a security policy, according to an example. Although execution of method 500 is described below with reference to computing device 400, other suitable components for execution of method 500 can be utilized (e.g., computing system 100). Additionally, the components for executing the method 500 may be spread among multiple devices. Method 500 may be implemented in the form of executable instructions stored on a machine-readable storage medium, such as storage media 420, and/or in the form of electronic circuitry. A runtime agent can be implemented by executing modify instructions 424.

At 502, the computing device 400 provides a modified web page to a browser. In this example, the browser implements a security policy (e.g., CSP). The modified web page can have a security policy header (e.g., as a hash) included as part of injected code to protect output of the modified web page. The header can indicate approved sources of content that browsers should be allowed to load on that page, for example, content that meets a hash included in the header and/or a nonce value included in the header. As noted above, in certain examples, the injected code can include a protection against XSS. The injected code can invoke a browser policy response when a supporting browser executes the web page. When a device asks for the web page, the web page can be provided from a memory (e.g., if the web page was previously generated) or be generated in response to a request from the device's browser.

The browser, when executing the web page, can determine whether content on the web page is approved based on the security policy header. If content in the output of the web page does not hash to a value of a hash in the security policy header and does not carry a nonce in the security policy header, then a violation has occurred. As such, the browser can limit access to the content that is not approved (e.g., by blocking, warning the user, asking the user to override to see the content, etc.). Further, the browser can provide a violation report back to the web application indicating that the violation occurred. An API can be used to send the report.
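The hash and nonce approval can be sketched as follows, using the CSP `script-src` source forms `'sha256-...'` and `'nonce-...'`. This is a simplified model under stated assumptions: the function names are hypothetical, and `is_approved` only approximates the decision a supporting browser makes.

```python
import base64
import hashlib
import secrets
from typing import Iterable, Optional


def script_hash(script_body: str) -> str:
    """Base64-encoded SHA-256 of the script text, in CSP hash-source form."""
    digest = hashlib.sha256(script_body.encode("utf-8")).digest()
    return "sha256-" + base64.b64encode(digest).decode("ascii")


def new_nonce() -> str:
    """Random, unguessable value to attach to dynamic scripts on each response."""
    return base64.b64encode(secrets.token_bytes(16)).decode("ascii")


def build_policy(static_scripts: Iterable[str], nonce: str) -> str:
    """Header value listing the approved static hashes and the nonce."""
    sources = [f"'{script_hash(body)}'" for body in static_scripts]
    sources.append(f"'nonce-{nonce}'")
    return "script-src " + " ".join(sources)


def is_approved(policy: str, script_body: str, nonce: Optional[str] = None) -> bool:
    """Simplified check: approve on a matching hash or a matching nonce."""
    sources = policy.split()[1:]
    if f"'{script_hash(script_body)}'" in sources:
        return True
    return nonce is not None and f"'nonce-{nonce}'" in sources
```

A static script whose text is unchanged is approved by hash; a dynamic script is approved only if it carries the nonce listed in the header, so script text injected by an attacker matches neither.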

At 504, the API can call the runtime agent and the runtime agent can receive the report from the browser. When the browser sends the report to the web application, the runtime agent can execute in response to an API being called when the report is provided (e.g., via a program point at the API) and thus receive the violation report.

At 506, the runtime agent can respond to the violation report. In one example, the violation report can be logged as a security event in response to reception of the violation report. In another example, the security event can be presented (e.g., via a message or email, the log, etc.) to an administrator. Further action can also be taken, for example, blocking access to an address associated with the content causing the violation in other web pages.
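A possible shape for the agent's response is sketched below. It assumes the report arrives as a JSON body in the CSP `report-uri` format (a `csp-report` object with `document-uri`, `violated-directive`, and `blocked-uri` fields); the handler name, the logger name, and the in-memory blocklist are hypothetical.

```python
import json
import logging
from typing import Dict, Set

logger = logging.getLogger("runtime_agent")  # hypothetical logger name

# Hypothetical blocklist of sources that caused a violation.
BLOCKED_SOURCES: Set[str] = set()


def handle_violation_report(body: str) -> Dict[str, str]:
    """Log the report as a security event and remember the offending source."""
    report = json.loads(body).get("csp-report", {})
    event = {
        "page": report.get("document-uri", ""),
        "directive": report.get("violated-directive", ""),
        "blocked": report.get("blocked-uri", ""),
    }
    logger.warning("security event: %s", event)
    if event["blocked"]:
        # Further action: the source can be refused in other web pages as well.
        BLOCKED_SOURCES.add(event["blocked"])
    return event
```

Notifying an administrator could be added by attaching a suitable handler to the logger rather than changing this function.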

As noted above, there are chances that the runtime agent cannot confirm whether the transformation approaches will accidentally block normal operations, thereby causing false positives. To alleviate this possibility, the runtime agent can first assume the security policy (e.g., CSP) is supported and can issue a request to not enforce the policy, but instead provide a report back (in CSP, by providing a “Content-Security-Policy-Report-Only” header) to instruct the browser not to block any policy violation. If the runtime agent does not receive any policy violation report after a threshold number of policy compliant browsers have been serviced, then the runtime agent can conclude that the policy is actually supported correctly on that page. Then, the “Content-Security-Policy-Report-Only” header can be removed. The higher the threshold number of policy compliant browsers serviced, the greater the confidence that the security policy is supported. Otherwise, responsive to receiving one of the violation reports, the runtime agent can cause the web application to re-generate or re-compile the web page with the policy (e.g., CSP) disabled. As noted above, segments of the injected code can be removed to guard against inadvertently blocking normal operations.
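The report-only trial can be sketched as a small per-page state machine. The class name, state names, and method names are hypothetical, and counting every served page as one policy compliant browser is a simplifying assumption.

```python
from typing import Optional


class PolicyRollout:
    """Per-page trial: report-only first, then enforce or disable."""

    REPORT_ONLY = "report_only"
    ENFORCE = "enforce"
    DISABLED = "disabled"

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold  # compliant browsers to service before enforcing
        self.served = 0
        self.state = self.REPORT_ONLY

    def page_served(self) -> None:
        """Count a serviced browser; enforce once the threshold passes cleanly."""
        if self.state == self.REPORT_ONLY:
            self.served += 1
            if self.served >= self.threshold:
                self.state = self.ENFORCE

    def violation_reported(self) -> None:
        """A report during the trial suggests a false positive: disable the policy."""
        if self.state == self.REPORT_ONLY:
            self.state = self.DISABLED

    def header_name(self) -> Optional[str]:
        """Header to send with the page, or None when the policy is disabled."""
        if self.state == self.REPORT_ONLY:
            return "Content-Security-Policy-Report-Only"
        if self.state == self.ENFORCE:
            return "Content-Security-Policy"
        return None
```

A larger `threshold` delays enforcement but gives more confidence that the injected policy does not block normal operations on that page.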

FIG. 6 is a flowchart of a method for modifying code of a web page to include context sensitive validation, according to an example. Although execution of method 600 is described below with reference to computing device 400, other suitable components for execution of method 600 can be utilized (e.g., computing system 100). Additionally, the components for executing the method 600 may be spread among multiple devices. Method 600 may be implemented in the form of executable instructions stored on a machine-readable storage medium, such as storage media 420, and/or in the form of electronic circuitry. A runtime agent can be implemented by executing modify instructions 424.

As noted above, when a web page is being generated, the runtime agent can be called to intercept a process to generate the web page. At 602, the runtime agent determines context sensitive validation for the source code of the web page. As noted above, various contexts can be determined based on one identifier or more than one identifier in the source code. A set of possible identifiers for identifying context in a web page can be stored in a data structure associated with the runtime agent. The identifiers can include identifiers that open a context and identifiers that close a context. A context is the setting that output of a webpage is in. For example, if a first identifier <html> followed by a second identifier <body> precedes the output and the output is followed by </body> and </html>, the context of the output is HTML text. In one example, the context sensitive validation can be determined based on a mapping of the context to the validation as noted above. At 604, the context sensitive validation can be included into the web page source code as injected code to protect output before the web page is generated.

When a browser executes the web page, the context sensitive validation can be executed. If the validation passes, the browser continues to load for the user. If the validation is determined to fail, at 606, the browser can provide the information that the validation failed back to the application via an API. The API can include code to notify the runtime agent of the information at 608. Thus the runtime agent is provided the information that the validation failed as a security alert. The runtime agent can insert the injected code at the API to call the runtime agent when the API receiving the alert is executed. The runtime agent can then log the security event, report the event with a message, etc. Further action can be taken as well, for example, the runtime agent may determine an identifier associated with the content (e.g., IP address) that can be used to blacklist associated content.

FIG. 7 is a diagram representing an unmodified section of code that includes output at a web page, according to an example. FIG. 8 is a diagram of a modified section of the code of FIG. 7 that includes output at a web page, according to an example. FIGS. 7 and 8 are shown using pseudo code to show how code can be injected into a web page to protect output.

The web page code 700 can include starts to scripts identified by <script> 701 a, 701 b and stops to scripts identified by </script> 703 a, 703 b. In this example, the script 701 a is a static script and script 701 b is a dynamic script. Further, the code includes a form that is identified at 705 and closed at 707. During processes described herein, the web page code 700 can be converted to web page code 800 to provide added protection to output.

As noted above, compatibility for a browser security policy can be implemented. In web page code 800, a security policy header 801 can be added that includes information about particular content (in this example, scripts) that is allowable content. For a static script such as script 803 a, a static hash can be calculated 803 b and added to the permissible content in the header 801. For dynamic scripts 805 a, 805 b, a nonce can be generated and added. The nonce can be a random value. The nonce can be indicated in the header 801 at 805 c. In one example, the form 705 of FIG. 7 can be modified from being written as an event to be rewritten as a script in FIG. 8 to enable enforcement of the security policy. In the example, an action, a submit action on the form, triggers the event to activate. In the script form, an interrupt is used to track the submit action to call the script. In the example shown, the security policy relates to CSP.

Additionally or alternatively, context sensitive validation can be injected at 821, 823, 825, 827 of web page code 800. The validation at 821 can be used to ensure that variable “name” is a string due to its context. Further, validation 823 can be used to check that there is no “<” due to the location of the use of “name” in an HTML header. Moreover, validation 825 can check that there is not a quote in “data,” which could lead to possible XSS vulnerability. Further, validation 827 can similarly check that there is not a quote in “name”, which could lead to possible XSS vulnerability.
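The kinds of checks described for 821 through 827 can be approximated as follows. This is a sketch only: the function names are hypothetical Python analogues and are not the functions shown in FIG. 8, and raising `ValueError` is an assumed way of signaling a validation failure.

```python
from typing import Any, Callable


def check_is_string(value: Any) -> str:
    """Reject any value that is not a string (the kind of check described at 821)."""
    if not isinstance(value, str):
        raise ValueError("expected a string")
    return value


def check_no_tag(value: Any) -> str:
    """Reject a value containing '<', which could open a tag in HTML text (as at 823)."""
    text = check_is_string(value)
    if "<" in text:
        raise ValueError("'<' is not allowed in this context")
    return text


def check_no_quote(value: Any) -> str:
    """Reject a value containing a quote, which could end a quoted context (as at 825, 827)."""
    text = check_is_string(value)
    if '"' in text or "'" in text:
        raise ValueError("quotes are not allowed in this context")
    return text


def passes(check: Callable[[Any], str], value: Any) -> bool:
    """Convenience wrapper: True if the value passes the check, False otherwise."""
    try:
        check(value)
        return True
    except ValueError:
        return False
```

For example, `passes(check_no_tag, "<script>")` is False, while `passes(check_no_tag, "Alice")` is True. A failed check is the point at which an alert would be provided to the runtime agent.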

What is claimed is:
 1. A method comprising: responsive to initiation of a compile process for generation of a web page of a web application at a server, executing a runtime agent; prior to execution of the compile process, modifying, by the runtime agent, code of the web page to include injected code to protect output of the web page, the injected code including a call to execute a method of the runtime agent and a context sensitive validation function; and executing the compile process using the modified code to generate a modified web page.
 2. The method of claim 1, wherein the injected code includes a security policy header to invoke a browser policy response.
 3. The method of claim 2, further comprising: providing the modified web page to a browser, wherein the compile process is initiated in response to a request from the browser.
 4. The method of claim 3, further comprising: receiving a violation report at the runtime agent from the browser; and responsive to receiving the violation report, logging the violation report as a security event.
 5. The method of claim 4, further comprising: providing another header to the browser, the another header to set the browser to report, but not enforce, a browser policy as part of the browser policy response; and recompiling the web page to generate a dynamically generated web page without the injected code in response to the violation report.
 6. The method of claim 1, further comprising: determining, by the runtime agent, a context sensitive validation for the code using the context sensitive validation function.
 7. The method of claim 6, further comprising: executing the context sensitive validation function during execution of the modified web page; and providing the runtime agent an alert of a validation failure in response to a determination that the validation failure has occurred.
 8. A computing system comprising: a physical processor implementing machine readable instructions that cause the computing system to: serve a web application; responsive to initiation of a compile process to generate a web page of the web application, execute a runtime agent; prior to execution of the compile process, modify, by the runtime agent, code of the web page to include injected code to protect output of the web page, the injected code including a call to execute a method of the runtime agent, wherein the injected code includes a context sensitive validation function; and compile the modified code to generate a modified compiled web page.
 9. The computing system of claim 8, wherein the physical processor implements machine readable instructions that cause the computing system to: provide the modified compiled web page to a browser, wherein the injected code includes a security policy header to invoke a browser policy response; receive a violation report as part of the browser policy response at the runtime agent; and responsive to receiving the violation report, log the violation report as a security event.
 10. The computing system of claim 9, wherein the physical processor implements machine readable instructions that cause the computing system to: provide a second header to the browser, the second header indicating to the browser to report, but not enforce, a browser policy as part of the browser policy response; and responsive to receiving the violation report, recompile the code of the web page to generate another web page without the injected code.
 11. The computing system of claim 8, wherein the physical processor implements machine readable instructions that cause the computing system to: determine, using the context sensitive validation function, a context sensitive validation for the code based on an identifier in the code related to the output; and include the context sensitive validation in the injected code.
 12. The computing system of claim 11, wherein the physical processor implements machine readable instructions that cause the computing system to: determine that a validation failure occurred for the output during execution of the modified compiled web page; and provide a security alert to the runtime agent indicating the validation failure.
 13. A non-transitory machine-readable storage medium storing instructions that, if executed by at least one processor of a device, cause the device to: responsive to initiation of a compile process to generate a web page of a served web application, execute a runtime agent; and prior to execution of the compile process, modify, by the runtime agent, code of the web page to include injected code to protect output of the web page based on an identifier associated with the output, the injected code including a call to execute a method of the runtime agent, wherein the injected code includes a context sensitive validation function, wherein when the web page is compiled, a modified compiled web page that includes the modified code is generated.
 14. The non-transitory machine-readable storage medium of claim 13, further comprising instructions that, if executed by the at least one processor, cause the device to: receive a violation report from a browser at the runtime agent in response to provisioning by the served web application of the modified compiled web page to the browser; and log the violation report.
 15. The non-transitory machine-readable storage medium of claim 13, further comprising instructions that, if executed by the at least one processor, cause the device to: determine the context sensitive validation function for the code based on the identifier; determine that a validation for the output failed; and provide a security alert to the runtime agent describing a failure of the validation.
 16. The non-transitory machine-readable storage medium of claim 13, wherein the injected code includes a security policy header to invoke a policy response of a browser displaying the modified compiled web page.
 17. The non-transitory machine-readable storage medium of claim 16, wherein the security policy header includes a hash value calculated by the runtime agent.
 18. The non-transitory machine-readable storage medium of claim 13, wherein the call is to execute the method of the runtime agent to record a security event.
 19. The method of claim 1, wherein the call is to execute the method of the runtime agent to record a security event.
 20. The computing system of claim 8, wherein the call is to execute the method of the runtime agent to record a security event. 